AITopics | shakespeare dataset


shakespeare dataset


Contents of the Appendix

Neural Information Processing Systems, Apr-30-2026, 05:08:35 GMT

A.1 CIFAR-10 dataset. Figure 6 displays test accuracy curves for all six backbone algorithms under three distinct imbalance parameters in $\{0.3, 1, 10\}$. The results show that FedNAR outperforms the baselines, particularly in scenarios with imbalanced data.

A.2 Shakespeare dataset. Figures 7 and 8 show the results of experiments on the Shakespeare dataset. Six backbone algorithms were used, with initial weight decay values selected from $\{10^{-3}, 10^{-4}\}$. These findings indicate that FedNAR, as an adaptive weight decay scheduling algorithm, is effective across various initial weight decay values.

algorithm, artificial intelligence, rfi, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.48)


6cb7246003d556c4d1cbf9c17c392ee3-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 15:36:28 GMT

clipping, dataset, inequality follow, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


A Proofs for Fat Tailed Federated Learning

Neural Information Processing Systems, Aug-15-2025, 15:52:42 GMT

A.1 Proof of FAT-Clipping-PR. For notational clarity, we have the following update: Local update: x ... The first inequality follows from the strongly-convex property, i.e., ... Assumption 4. (Bounded Stochastic Gradient Variance) There exists a constant ... Assumption 5. (Bounded Gradient) There exists a constant ... We remark that for any stochastic estimator that satisfies the above conditions, the above inequalities hold. The proof is exactly the same as the original proof [18]. Theorem 6. Suppose f is ... We run a convolutional neural network (CNN) model on the CIFAR-10 dataset using FedAvg. The CNN architecture is shown in Table 2. To simulate data heterogeneity across clients, we manually ... The dataset and model are taken from [45]. This implies that the gradient noise is fat-tailed.

clipping, dataset, inequality follow, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Power-of-Two (PoT) Weights in Large Language Models (LLMs)

Elgenedy, Mahmoud

arXiv.org Artificial Intelligence, Jun-3-2025

The complexity of neural networks is increasing rapidly due to the massive increase in model parameters. Specifically, in Large Language Models (LLMs), the number of model parameters has grown exponentially in the past few years, for example, from 1.5 billion parameters in GPT-2 to 175 billion in GPT-3. This raises a significant challenge for implementation, especially on edge devices, where memory and processing power are very limited. In this work, we investigate reducing LLM complexity with a special type of quantization, power of two (PoT), for linear-layer weights and transformer tables. PoT not only reduces memory but, more importantly, significantly reduces computation by converting multiplication to bit shifting. We obtained preliminary results of PoT quantization on a Nano-GPT implementation using the Shakespeare dataset. We then extended the results to the 124M GPT-2 model. The PoT quantization results are shown to be very promising, with a cross-entropy loss degradation of $\approx$ [1.3-0.88] for a bit range of [4-6] used to represent the power levels.
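The idea of power-of-two quantization described in this abstract can be illustrated with a minimal sketch. It is not the paper's scheme: the function names, the rounding in the log domain, and the exponent range (`e_max` down to `e_max - (2**bits - 1)`) are all assumptions made for illustration. Each weight is stored as a sign and an integer exponent, so multiplying an integer activation by a weight reduces to a shift.

```python
import numpy as np

def pot_quantize(w, bits=4, e_max=0):
    """Map each weight to sign * 2**exp by rounding log2(|w|).

    Assumption (not from the paper): exponents are clipped to the
    2**bits integer levels ending at e_max.
    """
    w = np.asarray(w, dtype=np.float64)
    e_min = e_max - (2 ** bits - 1)
    sign = np.sign(w)  # 0 for zero weights, so they stay exactly zero
    # Floor the magnitude so log2 never sees 0.
    mag = np.maximum(np.abs(w), 2.0 ** (e_min - 1))
    exp = np.clip(np.round(np.log2(mag)), e_min, e_max).astype(np.int64)
    return sign, exp

def pot_dequantize(sign, exp):
    """Reconstruct the floating-point value sign * 2**exp."""
    return sign * np.exp2(exp.astype(np.float64))

def shift_multiply(x, sign, exp):
    """Multiply an integer x by sign * 2**exp using only a bit shift."""
    x, exp = int(x), int(exp)
    shifted = x << exp if exp >= 0 else x >> -exp
    return int(sign) * shifted

sign, exp = pot_quantize([0.3, -0.6, 0.0, 1.0], bits=4)
print(pot_dequantize(sign, exp))   # quantized weights
print(shift_multiply(12, -1, -2))  # 12 * (-0.25) computed with a right shift
```

Note that the right shift truncates, so `shift_multiply` matches exact multiplication only when the result is an integer; a real fixed-point pipeline would have to choose a rounding policy.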

large language model, machine learning, quantization, (18 more...)

arXiv.org Artificial Intelligence

2506.00315

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Apodotiko: Enabling Efficient Serverless Federated Learning in Heterogeneous Environments

Chadha, Mohak, Jensen, Alexander, Gu, Jianfeng, Abboud, Osama, Gerndt, Michael

arXiv.org Artificial Intelligence, Apr-22-2024

Federated Learning (FL) is an emerging machine learning paradigm that enables the collaborative training of a shared global model across distributed clients while keeping the data decentralized. Recent works on designing systems for efficient FL have shown that utilizing serverless computing technologies, particularly Function-as-a-Service (FaaS) for FL, can enhance resource efficiency, reduce training costs, and alleviate the complex infrastructure management burden on data holders. However, current serverless FL systems still suffer from the presence of stragglers, i.e., slow clients that impede the collaborative training process. While strategies aimed at mitigating stragglers in these systems have been proposed, they overlook the diverse hardware resource configurations among FL clients. To this end, we present Apodotiko, a novel asynchronous training strategy designed for serverless FL. Our strategy incorporates a scoring mechanism that evaluates each client's hardware capacity and dataset size to intelligently prioritize and select clients for each training round, thereby minimizing the effects of stragglers on system performance. We comprehensively evaluate Apodotiko across diverse datasets, considering a mix of CPU and GPU clients, and compare its performance against five other FL training strategies. Results from our experiments demonstrate that Apodotiko outperforms other FL training strategies, achieving an average speedup of 2.75x and a maximum speedup of 7.03x. Furthermore, our strategy significantly reduces cold starts by a factor of four on average, demonstrating suitability in serverless environments.

apodotiko, dataset, international conference, (13 more...)

arXiv.org Artificial Intelligence

2404.14033

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > California > San Diego County > Carlsbad (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Federated Learning with Sparsified Model Perturbation: Improving Accuracy under Client-Level Differential Privacy

Hu, Rui, Gong, Yanmin, Guo, Yuanxiong

arXiv.org Artificial Intelligence, Nov-15-2022

Federated learning (FL), which enables edge devices to collaboratively learn a shared model while keeping their training data local, has received great attention recently and can protect privacy better than the traditional centralized learning paradigm. However, sensitive information about the training data can still be inferred from the model parameters shared in FL. Differential privacy (DP) is the state-of-the-art technique to defend against those attacks. The key challenge to achieving DP in FL lies in the adverse impact of DP noise on model accuracy, particularly for deep learning models with large numbers of parameters. This paper develops a novel differentially private FL scheme named Fed-SMP that provides a client-level DP guarantee while maintaining high model accuracy. To mitigate the impact of privacy protection on model accuracy, Fed-SMP leverages a new technique called Sparsified Model Perturbation (SMP), in which local models are first sparsified and then perturbed by Gaussian noise. We provide a tight end-to-end privacy analysis for Fed-SMP using Rényi DP and prove the convergence of Fed-SMP with both unbiased and biased sparsifications. Extensive experiments on real-world datasets demonstrate the effectiveness of Fed-SMP in improving model accuracy under the same DP guarantee while also saving communication cost.
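The "sparsify first, then perturb" step named in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not Fed-SMP itself: the function name, the L2 clipping step, the choice between top-k (biased) and random-k selection, and the noise scale are all hypothetical, and no privacy accounting is performed.

```python
import numpy as np

def sparsify_then_perturb(update, k, clip_norm, noise_std, rng, mode="topk"):
    """Keep k coordinates of a 1-D update, clip it, then add Gaussian noise.

    Assumptions (not from the paper): noise is added only to the kept
    coordinates, and clipping bounds the L2 norm of the sparse update.
    """
    update = np.asarray(update, dtype=np.float64)
    d = update.size
    if mode == "topk":
        # Biased sparsifier: the k largest-magnitude coordinates.
        keep = np.argsort(np.abs(update))[-k:]
    else:
        # Random sparsifier: k coordinates chosen uniformly at random.
        keep = rng.choice(d, size=k, replace=False)
    mask = np.zeros(d, dtype=bool)
    mask[keep] = True

    sparse = np.where(mask, update, 0.0)
    norm = np.linalg.norm(sparse)
    sparse = sparse * min(1.0, clip_norm / max(norm, 1e-12))

    noise = rng.normal(0.0, noise_std, size=d)
    return sparse + np.where(mask, noise, 0.0), mask

rng = np.random.default_rng(0)
noisy, mask = sparsify_then_perturb(
    [0.1, -5.0, 0.2, 3.0], k=2, clip_norm=1.0, noise_std=0.1, rng=rng
)
print(mask)   # which coordinates were kept
print(noisy)  # dropped coordinates remain exactly zero
```

Because the dropped coordinates carry neither signal nor noise, only `k` values need to be transmitted per round, which is the intuition behind the communication saving the abstract mentions.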

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2202.07178

Country:

North America > United States > Nevada > Washoe County > Reno (0.14)
North America > United States > Texas > Bexar County > San Antonio (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Communication-Efficient Federated Learning via Optimal Client Sampling

Ribero, Monica, Vikalo, Haris

arXiv.org Machine Learning, Oct-14-2020

Federated learning (FL) ameliorates privacy concerns in settings where a central server coordinates learning from data distributed across many clients. The clients train locally and communicate the models they learn to the server; aggregation of local models requires frequent communication of large amounts of information between the clients and the central server. We propose a novel, simple and efficient way of updating the central model in communication-constrained settings based on collecting models from clients with informative updates and estimating local updates that were not communicated. In particular, modeling the progression of the model's weights by an Ornstein-Uhlenbeck process allows us to derive an optimal sampling strategy for selecting a subset of clients with significant weight updates. The central server collects updated local models from only the selected clients and combines them with estimated model updates of the clients that were not selected for communication. We test this policy on a synthetic dataset for logistic regression and two FL benchmarks, namely, a classification task on EMNIST and a realistic language modeling task using the Shakespeare dataset. The results demonstrate that the proposed framework provides a significant reduction in communication while maintaining competitive or achieving superior performance compared to a baseline. Our method represents a new line of strategies for communication-efficient FL that is orthogonal to the existing user-local methods such as quantization or sparsification, thus complementing rather than aiming to replace those existing methods.
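For readers unfamiliar with the Ornstein-Uhlenbeck process mentioned in this abstract, the sketch below simulates a scalar path of the standard process $d\theta_t = \lambda(\mu - \theta_t)\,dt + \sigma\,dW_t$ with an Euler-Maruyama discretization. It illustrates only the process itself; the function name and parameter values are assumptions, and the paper's sampling strategy derived from this model is not reproduced here.

```python
import numpy as np

def simulate_ou(theta0, mu, lam, sigma, dt, n_steps, rng):
    """Euler-Maruyama simulation of a scalar Ornstein-Uhlenbeck process.

    theta0: initial value; mu: long-run mean; lam: mean-reversion rate;
    sigma: noise scale; dt: step size. Returns n_steps + 1 values.
    """
    path = np.empty(n_steps + 1)
    path[0] = theta0
    for t in range(n_steps):
        noise = sigma * np.sqrt(dt) * rng.standard_normal()
        # Drift pulls the value toward mu; noise perturbs it.
        path[t + 1] = path[t] + lam * (mu - path[t]) * dt + noise
    return path

rng = np.random.default_rng(0)
path = simulate_ou(theta0=0.0, mu=1.0, lam=1.0, sigma=0.2, dt=0.1,
                   n_steps=100, rng=rng)
print(path[:5])
```

The mean-reverting drift is what makes the process a plausible stand-in for a weight that drifts toward a local optimum while being jostled by stochastic gradient noise.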

artificial intelligence, machine learning, threshold, (17 more...)

arXiv.org Machine Learning

2007.15197

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


LEAF: A Benchmark for Federated Settings

Caldas, Sebastian, Wu, Peter, Li, Tian, Konečný, Jakub, McMahan, H. Brendan, Smith, Virginia, Talwalkar, Ameet

arXiv.org Machine Learning, Dec-3-2018

Modern federated networks, such as those comprised of wearable devices, mobile phones, or autonomous vehicles, generate massive amounts of data each day. This wealth of data can help to learn models that can improve the user experience on each device. However, learning in federated settings presents new challenges at all stages of the machine learning pipeline. As the machine learning community begins to tackle these challenges, we are at a critical time to ensure that developments made in this area are grounded in real-world assumptions. To this end, we propose LEAF, a modular benchmarking framework for learning in federated settings. LEAF includes a suite of open-source federated datasets, a rigorous evaluation framework, and a set of reference implementations, all geared towards capturing the obstacles and intricacies of practical federated environments.

artificial intelligence, learning, machine learning, (14 more...)

arXiv.org Machine Learning

1812.01097

Country:

North America > United States > Virginia (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
